
    Concentration inequalities for order statistics

    This note describes non-asymptotic variance and tail bounds for order statistics of samples of independent, identically distributed random variables. These bounds are shown to be asymptotically tight when the sampling distribution belongs to a maximum domain of attraction. If the sampling distribution has a non-decreasing hazard rate (this includes the Gaussian distribution), we derive an exponential Efron-Stein inequality for order statistics: an inequality connecting the logarithmic moment generating function of centered order statistics with exponential moments of Efron-Stein (jackknife) estimates of variance. We use this general connection to derive variance and tail bounds for order statistics of a Gaussian sample. These bounds are not within the scope of the Tsirelson-Ibragimov-Sudakov Gaussian concentration inequality. Proofs are elementary and combine Rényi's representation of order statistics with the so-called entropy approach to concentration inequalities popularized by M. Ledoux.
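
The Rényi representation used in the proofs can be stated concretely: the k-th order statistic of n i.i.d. standard exponential variables is distributed as a sum of k independent, rescaled exponential spacings. A minimal pure-Python sketch of this fact (sample size, seed and Monte-Carlo budget are illustrative, not from the paper):

```python
import random

def renyi_order_stat(n, k, rng):
    """Draw the k-th order statistic of n i.i.d. Exp(1) variables via
    Renyi's representation: sum of independent spacings E_i / (n - i + 1)."""
    return sum(rng.expovariate(1.0) / (n - i + 1) for i in range(1, k + 1))

def expected_order_stat(n, k):
    """Exact mean implied by the representation: sum_{i=1}^k 1/(n - i + 1)."""
    return sum(1.0 / (n - i + 1) for i in range(1, k + 1))

rng = random.Random(0)
n, k = 10, 3
draws = [renyi_order_stat(n, k, rng) for _ in range(200_000)]
mc_mean = sum(draws) / len(draws)
# The Monte-Carlo mean should sit close to the exact value 1/10 + 1/9 + 1/8
print(round(expected_order_stat(n, k), 4), round(mc_mean, 4))
```

Because each summand is an independent exponential, variance bounds for the order statistic reduce to bounds on a weighted sum, which is what makes the representation so convenient in the entropy approach.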

    Tail index estimation, concentration and adaptivity

    This paper presents an adaptive version of the Hill estimator based on Lepski's model selection method. This simple, data-driven index selection method is shown to satisfy an oracle inequality and to achieve the lower bound recently derived by Carpentier and Kim. To establish the oracle inequality, we derive non-asymptotic variance bounds and concentration inequalities for Hill estimators. These concentration inequalities are derived from Talagrand's concentration inequality for smooth functions of independent exponentially distributed random variables, combined with three tools of Extreme Value Theory: the quantile transform, Karamata's representation of slowly varying functions, and Rényi's characterisation of the order statistics of exponential samples. The performance of this computationally and conceptually simple method is illustrated using Monte Carlo simulations.
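
The Hill estimator itself has a simple closed form: it averages the log-ratios of the k largest order statistics over the (k+1)-th largest. A hedged pure-Python sketch (the Pareto test distribution, sample size and fixed choice of k are illustrative; the paper's contribution is precisely to choose k adaptively):

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimator of the extreme-value (tail) index gamma, built from
    the k largest order statistics of the sample."""
    xs = sorted(sample, reverse=True)   # xs[0] >= xs[1] >= ...
    threshold = xs[k]                   # the (k+1)-th largest value
    return sum(math.log(x / threshold) for x in xs[:k]) / k

# Pareto(alpha) sample via inverse-CDF sampling; its tail index is 1/alpha.
rng = random.Random(1)
alpha = 2.0
sample = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(100_000)]
gamma_hat = hill_estimator(sample, k=2_000)
print(round(gamma_hat, 3))  # should be near 1/alpha = 0.5
```

In practice the estimate is highly sensitive to k (small k: high variance; large k: bias from the slowly varying part), which is the trade-off the Lepski-type selection rule addresses.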

    Real-time prediction of severe influenza epidemics using Extreme Value Statistics

    Each year, seasonal influenza epidemics cause hundreds of thousands of deaths worldwide and put high loads on health care systems. A main concern for resource planning is the risk of exceptionally severe epidemics. Taking advantage of the weekly reporting of influenza cases in France, we use recent results on multivariate GP models in Extreme Value Statistics to develop methods for real-time prediction of the risk that an ongoing epidemic will be exceptionally severe and for real-time detection of anomalous epidemics. The quality of the predictions is assessed on observed and simulated data.

    Predicting extremes: influenza epidemics in France

    Influenza epidemics cause hundreds of thousands of deaths worldwide each year and put high loads on health care systems, in France and elsewhere. A main concern for resource planning in public health is the risk of an extreme and dangerous epidemic. The size of an epidemic is measured by the number of visits to doctors caused by Influenza-Like Illness (ILI), and health care planning relies on prediction of ILI rates. We use recent results on multivariate Generalized Pareto (GP) distributions in Extreme Value Statistics to develop methods for real-time prediction of the risk of exceeding very high levels and for detection of unusual and potentially very dangerous epidemics. Based on the observation of the first two weeks of an epidemic, the GP method for real-time prediction is employed to predict the ILI rate of the third week and the total size of the epidemic for extreme influenza epidemics in France. We then apply a general anomaly detection framework to the ILI rates of the first three weeks of an epidemic for early detection of unusual extreme epidemics. As an additional input to resource planning, we use standard methods from extreme value statistics to estimate the risk of exceeding high ILI levels in future years. The new methods are expected to be broadly applicable in health care planning and in many other areas of science and technology.
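
The peaks-over-threshold idea behind GP methods can be sketched in a few lines: excesses over a high threshold are modelled by a Generalized Pareto distribution, and tail probabilities follow from the fitted shape and scale. The sketch below uses a simple method-of-moments fit and an exponential test sample; the threshold, estimator and data are illustrative assumptions, not the multivariate real-time procedure of the paper:

```python
import math
import random

def fit_gpd_moments(excesses):
    """Method-of-moments fit of a Generalized Pareto distribution
    (shape xi, scale sigma) to threshold excesses."""
    n = len(excesses)
    m = sum(excesses) / n
    v = sum((x - m) ** 2 for x in excesses) / (n - 1)
    xi = 0.5 * (1.0 - m * m / v)
    sigma = 0.5 * m * (m * m / v + 1.0)
    return xi, sigma

def tail_prob(y, xi, sigma, p_exceed):
    """P(X > u + y) under the fitted POT model, with p_exceed = P(X > u)."""
    if abs(xi) < 1e-6:                        # xi ~ 0: exponential tail
        return p_exceed * math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    return p_exceed * base ** (-1.0 / xi) if base > 0 else 0.0

rng = random.Random(2)
data = [rng.expovariate(1.0) for _ in range(50_000)]
u = 2.0                                       # threshold choice (assumption)
exc = [x - u for x in data if x > u]
xi_hat, sigma_hat = fit_gpd_moments(exc)
p_u = len(exc) / len(data)
print(round(tail_prob(3.0, xi_hat, sigma_hat, p_u), 5))
```

For the exponential test data the true shape is xi = 0, so the fitted shape should be near zero and the estimated tail probability close to exp(-5).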

    Dynamics and rheology of vesicles in a shear flow under gravity and microgravity

    The behaviour of a vesicle suspension in a simple shear flow between plates (Couette flow) was investigated experimentally in parabolic-flight and sounding-rocket experiments using Digital Holographic Microscopy. The lift force that pushes deformable vesicles away from walls was quantitatively investigated and is found to be rather well described by a theoretical model by Olla [1]. At longer shearing times, vesicles reach a steady distribution about the center plane of the shear-flow chamber through a balance between the lift force and shear-induced diffusion due to hydrodynamic interactions between vesicles. This steady distribution was investigated in the BIOMICS experiment on the MASER 11 sounding rocket. The results allow an estimation of self-diffusion coefficients in vesicle suspensions and reveal possible segregation phenomena in polydisperse suspensions.

    A mixed-integer heuristic for the structural optimization of a cruise ship

    A heuristic approach is proposed to solve the structural optimization problem of a cruise ship. The challenge of the optimization is to define the scantlings of the ship's structure so as to minimize its weight or production cost. The variables are the dimensions and positions of the constitutive elements of the structure: they are discrete by nature. The objective functions are nonlinear, and the structure is subject to both geometric and structural constraints. The geometric constraints are linear functions, while the structural constraints are implicit functions that are expensive to evaluate. The problem therefore belongs to the class of mixed-integer nonlinear problems (MINLP). A local heuristic of the “dive and fix” type is combined with a solver based on approximation methods. The solver is used as a black-box tool to perform the structural analysis and to solve the nonlinear optimization problems (NLP) defined by the heuristic. The heuristic is designed to always provide a discrete feasible solution. Experiments on a real-size structure demonstrate that the optimal value of the mixed-integer problem is of the same magnitude as the optimal value of the problem in which all variables may take continuous values.
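
A “dive and fix” heuristic of this kind can be illustrated on a toy problem. Below, a continuous knapsack relaxation stands in for the expensive NLP solved by the black-box structural solver: the heuristic repeatedly solves the relaxation, fixes a fractional variable downward (which preserves feasibility of the capacity constraint), and re-solves until the solution is fully discrete. The instance and names are purely illustrative:

```python
def solve_relaxation(values, weights, capacity, fixed):
    """Greedy optimum of the continuous (LP) knapsack relaxation,
    honouring variables already fixed to 0 or 1 by the heuristic."""
    x = [0.0] * len(values)
    cap = capacity
    for i, v in fixed.items():
        x[i] = float(v)
        cap -= weights[i] * v
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: values[i] / weights[i], reverse=True)
    for i in free:
        if cap <= 0:
            break
        take = min(1.0, cap / weights[i])
        x[i] = take
        cap -= take * weights[i]
    return x

def dive_and_fix(values, weights, capacity):
    """Alternate between solving the relaxation and fixing fractional
    variables; fixing downward keeps the capacity constraint feasible,
    so the heuristic always ends with a discrete feasible solution."""
    fixed = {}
    while True:
        x = solve_relaxation(values, weights, capacity, fixed)
        frac = [i for i, xi in enumerate(x) if 0.0 < xi < 1.0]
        if not frac:
            return x
        fixed[frac[0]] = 0          # the "dive": round down and fix

values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
x = dive_and_fix(values, weights, capacity)
print(x, sum(v * xi for v, xi in zip(values, x)))
```

As with the ship problem, the returned discrete solution is feasible but not necessarily optimal; the point of the heuristic is to reach a usable integer solution with few expensive relaxation solves.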

    High resolution imaging of massive young stellar objects and a sample of molecular outflow sources

    This thesis presents a study of millimetre-wavelength observations of massive young stellar objects (MYSOs), obtained with both interferometers and single-dish telescopes. First, high angular resolution observations (up to ∼0.1″) of the MYSO S140 IRS1, taken with a variety of interferometers, are presented. This source is one of only two prototypes showing ionised equatorial emission from a radiatively driven disc wind. The observations confirm that IRS1 has a dusty disc at a position angle compatible with that of the disc-wind emission, confirming the disc-wind nature for the first time. Secondly, the observations of S140 IRS1 are modelled using a 2D axisymmetric radiative transfer code. Extensive models producing synthetic data at millimetre wavelengths were developed. These models show that on the largest scales, typically accessible with single-dish observations or compact interferometric configurations, the spectral energy distribution is relatively unchanged by the addition of a compact dust disc. However, a disc is required to match the interferometric visibilities on smaller scales. The position angle of the disc is well constrained via a newly developed 2D visibility-fitting method. The models are, however, degenerate, and there is a range of realistic best-fitting discs. The third section presents single-dish observations of the core material traced by C18O around 99 MYSOs and compact HII regions from the RMS survey. A method to calculate the core masses and velocity extents is reported. The method is accurate and robust, and can be applied to any molecular line emission. An updated distance-limited sample contains 87 sources and is complete to 10³ L⊙. It is a representative sample of MYSOs and HII regions. All of the cores harbour at least one massive protostar. Finally, methodologies to establish outflow parameters from 12CO (3-2) and 13CO (3-2) data are investigated. Multiple techniques are trialled for a well-studied test source, IRAS 20126+4104, and a repeatable outflow analysis pathway is described. In more complex regions, using the 12CO emission to identify outflows and determine their masses is more difficult, and an alternative method is suggested. Moreover, the dynamical timescales and dynamical parameters of the outflows are estimated in a spatially resolved sense rather than as simple averages. Such analysis will aid in categorising the different outflows from the full sample.
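
The simplest form of the outflow dynamical timescale mentioned above is the ratio of lobe length to maximum outflow velocity; the thesis evaluates the parameters spatially, but the scalar version is just this ratio plus a unit conversion. The numbers below are illustrative, not taken from the thesis:

```python
PC_IN_KM = 3.0857e13        # kilometres per parsec
YEAR_IN_S = 3.156e7         # seconds per year

def dynamical_time_yr(lobe_length_pc, v_max_kms):
    """Simple outflow dynamical timescale t_dyn = lobe length / maximum
    outflow velocity, converted from (pc, km/s) to years."""
    seconds = lobe_length_pc * PC_IN_KM / v_max_kms
    return seconds / YEAR_IN_S

# Illustrative numbers only: a 0.5 pc lobe with a 20 km/s maximum velocity.
print(f"{dynamical_time_yr(0.5, 20.0):.2e} yr")
```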

    Exploiting the noise: improving biomarkers with ensembles of data analysis methodologies.

    Background: The advent of personalized medicine requires robust, reproducible biomarkers that indicate which treatment will maximize therapeutic benefit while minimizing side effects and costs. Numerous molecular signatures have been developed over the past decade to fill this need, but their validation and uptake into clinical settings has been poor. Here, we investigate the technical reasons underlying reported failures in biomarker validation for non-small cell lung cancer (NSCLC). Methods: We evaluated two published prognostic multi-gene biomarkers for NSCLC in an independent 442-patient dataset. We then systematically assessed how technical factors influenced validation success. Results: Both biomarkers validated successfully (biomarker #1: hazard ratio (HR) 1.63, 95% confidence interval (CI) 1.21 to 2.19, P = 0.001; biomarker #2: HR 1.42, 95% CI 1.03 to 1.96, P = 0.030). Further, despite being underpowered for stage-specific analyses, both biomarkers successfully stratified stage II patients, and biomarker #1 also stratified stage IB patients. We then systematically evaluated reasons for reported validation failures and found that they can be directly attributed to technical challenges in data analysis. By examining 24 separate pre-processing techniques, we show that minor alterations in pre-processing can change a successful prognostic biomarker (HR 1.85, 95% CI 1.37 to 2.50, P < 0.001) into one indistinguishable from random chance (HR 1.15, 95% CI 0.86 to 1.54, P = 0.348). Finally, we develop a new method, based on ensembles of analysis methodologies, to exploit this technical variability to improve biomarker robustness and to provide an independent confidence metric. Conclusions: Biomarkers comprise a fundamental component of personalized medicine. We first validated two NSCLC prognostic biomarkers in an independent patient cohort. Power analyses demonstrate that even this large, 442-patient cohort is underpowered for stage-specific analyses. We then use these results to reveal an unexpected sensitivity of validation to subtle data analysis decisions. Finally, we develop a novel algorithmic approach that exploits this sensitivity to improve biomarker robustness.
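
One way to read the ensemble idea is as majority voting across pre-processing pipelines, with the agreement fraction serving as an independent confidence metric. The median-split voting below is a hypothetical minimal sketch, not the authors' actual algorithm; the pipelines, patient identifiers and scores are invented:

```python
from statistics import median

def ensemble_call(risk_scores_by_pipeline, patient):
    """Classify one patient as high- (1) or low- (0) risk under each
    pre-processing pipeline via a median split, then return the majority
    vote and the agreement fraction as a confidence metric."""
    votes = []
    for scores in risk_scores_by_pipeline:   # one {patient: score} per pipeline
        cutoff = median(scores.values())
        votes.append(1 if scores[patient] > cutoff else 0)
    call = 1 if sum(votes) > len(votes) / 2 else 0
    agreement = votes.count(call) / len(votes)
    return call, agreement

# Toy example: three hypothetical pipelines scoring four patients.
pipelines = [
    {"p1": 2.1, "p2": 0.3, "p3": 1.8, "p4": 0.5},
    {"p1": 1.9, "p2": 0.4, "p3": 0.2, "p4": 0.6},
    {"p1": 2.4, "p2": 0.2, "p3": 1.7, "p4": 0.4},
]
print(ensemble_call(pipelines, "p1"))
print(ensemble_call(pipelines, "p3"))
```

A patient classified identically under every pipeline (agreement 1.0) is a robust call; a patient whose label flips with the pre-processing choice gets a correspondingly lower confidence, which is exactly the sensitivity the abstract describes exploiting.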

    Suicide assisted by right-to-die associations: a population based cohort study

    Background: In Switzerland, assisted suicide is legal, but there is concern that vulnerable or disadvantaged groups are more likely to die in this way than other people. We examined socio-economic factors associated with assisted suicide. Methods: We linked the suicides assisted by right-to-die associations during 2003-08 to a census-based longitudinal study of the Swiss population. We used Cox and logistic regression models to examine associations with gender, age, marital status, education, religion, type of household, urbanization, neighbourhood socio-economic position and other variables. Separate analyses were done for younger (25 to 64 years) and older (65 to 94 years) people. Results: Analyses were based on 5 004 403 Swiss residents and 1301 assisted suicides (439 in the younger and 862 in the older group). In 1093 (84.0%) assisted suicides, an underlying cause was recorded; cancer was the most common cause (508, 46.5%). In both age groups, assisted suicide was more likely in women than in men, in those living alone compared with those living with others, and in those with no religious affiliation compared with Protestants or Catholics. The rate was also higher in more educated people, in urban compared with rural areas, and in neighbourhoods of higher socio-economic position. In older people, assisted suicide was more likely in the divorced compared with the married; in younger people, having children was associated with a lower rate. Conclusions: Assisted suicide in Switzerland was associated with female gender and situations that may indicate greater vulnerability, such as living alone or being divorced, but also with higher education and higher socio-economic position.
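
The associations above are expressed as hazard ratios from Cox regression, whose core is the partial likelihood. A toy pure-Python sketch for a single binary covariate, maximized by grid search (the cohort, covariate coding and grid are invented; the study fitted standard multivariable Cox models):

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for a single binary covariate x,
    assuming no tied event times (true for this toy cohort)."""
    ll = 0.0
    for i, t in enumerate(times):
        if not events[i]:
            continue                      # censored: no event term
        risk = [j for j in range(len(times)) if times[j] >= t]
        ll += beta * x[i] - math.log(sum(math.exp(beta * x[j]) for j in risk))
    return ll

# Toy cohort: exposed subjects (x=1) tend to die earlier than unexposed
# (x=0), so the fitted log hazard ratio beta_hat should come out positive.
times  = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
events = [1, 1, 1, 1, 1, 1]
x      = [1, 0, 1, 0, 1, 0]
grid = [b / 100 for b in range(-300, 301)]
beta_hat = max(grid, key=lambda b: cox_partial_loglik(b, times, events, x))
print(round(beta_hat, 2), "-> hazard ratio", round(math.exp(beta_hat), 2))
```

Real analyses use Newton-type maximization, tie handling and many covariates; the grid search here only makes the shape of the estimation problem visible.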